Abstract

Background: Recently, a genome-wide association study identified rs4331426 on
chromosome 18q11.2 as a susceptibility locus for tuberculosis in the African population. To validate the significance of this susceptibility
locus in other areas, we conducted a case-control study in the Chinese population.

Methods: The present study consisted of 578 cases and 756 controls. The SNP rs4331426 and six other tag SNPs
within 100 kb upstream and downstream of rs4331426 on chromosome 18q11.2 were genotyped by using the
TaqMan-based allelic discrimination system.

Results: As compared with the findings from the African population, genetic variation of the SNP rs4331426 was 
rare among the Chinese. No significant differences were observed in genotypes or allele frequencies of the tag 
SNPs between cases and controls either before or after adjusting for age, sex, education, smoking, and drinking 
history. However, we observed strong linkage disequilibrium among the SNPs. Haplotypes constructed within this block
were linked to altered risks of tuberculosis. For example, in comparison with the common haplotype AA (rs8087945-rs12456774),
haplotypes AG (rs8087945-rs12456774) and GA (rs8087945-rs12456774) were associated with a decreased risk of
tuberculosis, with adjusted odds ratios (95% confidence intervals) of 0.34 (0.27-0.42) and 0.22 (0.16-0.29),
respectively.

Conclusions: The susceptibility locus rs4331426 discovered in the African population could not be validated in the
Chinese population. None of the genetic polymorphisms we genotyped was related to tuberculosis in the
single-point analysis. However, haplotypes on chromosome 18q11.2 might contribute to an individual's susceptibility.
More work is necessary to identify the true causative variants of tuberculosis.



Background 

After two decades of neglect, tuberculosis (TB) has
re-emerged as a major public health problem, especially
in low- and middle-income countries [1]. Nearly one
third of the world's population is latently infected
with Mycobacterium tuberculosis
(MTB) [2]. However, only 10% of those infected will develop
active TB during their lifetimes [3-5]. Although previous
studies have indicated that susceptibility has a
substantial genetic component [6-8], progress in
identifying the genetic variants that contribute to TB
has been slow. With the completion of the Human Genome
Project and advances in genotyping technology, the
genome-wide association (GWA) study has become a powerful
tool for studying genetic susceptibility to human
complex diseases [9]. Despite the widely held view that
exposure to pathogens during human evolution has put
evolutionary pressure on host susceptibility, progress in
identifying susceptibility genes for infectious diseases
has been slow in comparison with other common
disorders [10]. Recently, a GWA study in Ghana and Gambia
identified rs4331426 on chromosome 18q11.2 as a
susceptibility locus associated with the risk of TB [odds
ratio (OR) = 1.19, 95% confidence interval (CI): 1.13-1.27,
P = 6.8 × 10^-9] [11]. However, this finding
has not yet been replicated in other populations.

China, which has the world's second largest TB epidemic,
differs from Africa in genetic background, lifestyle, and
disease prevalence [12]. An estimated
1.3 million people across the country developed
active TB in 2009, of whom 600 000 had the highly
infectious form. To validate the findings from the GWA
study in Ghana and Gambia and to search for more
susceptibility loci for TB, we performed a case-control study
with fine mapping of the 18q11.2 region in
the Chinese population.

Methods 

Study population 

This case-control study was conducted in Jiangsu, a
developed province located in the eastern part of China,
with a total population of 77 million in 2009. We
recruited 578 patients with pulmonary TB, including
368 (63.7%) new cases and 210 (36.3%) previously
treated ones. The definitions of new and previously
treated cases followed the WHO guidelines. In brief,
a "new case" was defined as a newly registered episode
of TB in a patient who, in response to direct questioning,
denied having had any prior antituberculosis treatment
(for up to one month) and for whom, in study sites where
adequate documentation was available, there was no
evidence of such history. A "previously treated case" was
defined as a newly registered episode of TB in a patient
who, in response to direct questioning, admitted having
been treated for TB for one month or more or for whom, in study
sites where adequate documentation was available, there
was evidence of such history. All patients were diagnosed
on the basis of sputum culture. Sputum
samples were cultured on Löwenstein-Jensen (LJ) culture
media. MTB was identified by using
the p-nitrobenzoic acid (PNB) and thiophene-2-carboxylic
acid hydrazide (TCH) resistance tests. Growth in LJ medium
containing PNB indicated that the bacilli did not
belong to the MTB complex. We also recruited 756
controls from a pool of individuals who participated in
the local community-based health examination program.
Controls were frequency-matched to the cases by sex
and age. These control subjects had no self-reported
history of TB, diabetes or malignancy. No case or
control had a prior HIV-positive history. Each subject
was individually interviewed in a local health facility
using a structured questionnaire and donated a blood
sample for genotyping analysis.

SNPs selection and genotyping 

The significant SNP rs4331426, which was identified in
a GWA study in Ghana and Gambia, is located on
chromosome 18q11.2 [11]. Because of the low
minor allele frequency (MAF) of this SNP (< 5%) among
Han Chinese, we further searched for tag SNPs around
it. First, we downloaded all eligible SNPs within 100
kb upstream and downstream of rs4331426 on
chromosome 18q11.2 from the Han Chinese in Beijing
(CHB) dataset of HapMap (http://www.hapmap.org/).
All SNPs in this region were filtered by using the
following criteria: (1) MAF > 0.05; (2) Hardy-Weinberg
equilibrium test P value > 0.05. Then, tag SNPs were
selected by using Haploview 4.2 software based on their
ability to tag surrounding variants [13]. As a result, 7
SNPs (rs4330012, rs8087945, rs12456774, rs12457731,
rs12958098, rs4800136 and rs4800417) were chosen for
genotyping, which was performed by using TaqMan
allelic discrimination technology on the ABI 7900
Real-Time PCR System (Applied Biosystems, Foster City, CA)
[14]. The primers and probes for each SNP (Table 1)
were designed by Nanjing Steed BioTechnologies Co.,
Ltd. Owing to a technical limitation, we could not design
probes for rs4330012 because another SNP,
rs9954441, lies adjacent to it. Preliminary experiments
were carried out for each SNP, and blank controls were
included in each batch of samples. Both the laboratory
personnel and the readers of the genotyping results were blinded to
the case or control status of the samples. The overall call rate of
genotyping was > 95%.
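
The MAF criterion used for SNP filtering can be illustrated with a short sketch. The HapMap CHB genotypes themselves are not reproduced in this article, so the example below applies the same calculation to the rs8087945 genotype counts reported in Table 3; the function names are hypothetical, and the code assumes the counts are passed in the order major homozygote, heterozygote, minor homozygote.

```python
def minor_allele_frequency(n_major_hom, n_het, n_minor_hom):
    """Frequency of the minor allele from diploid genotype counts.

    Assumes the caller already knows which allele is the minor one.
    """
    total_alleles = 2 * (n_major_hom + n_het + n_minor_hom)
    if total_alleles == 0:
        raise ValueError("no genotyped samples")
    # Each heterozygote carries one minor allele, each minor homozygote two.
    minor_alleles = n_het + 2 * n_minor_hom
    return minor_alleles / total_alleles


def passes_maf_filter(maf, threshold=0.05):
    """Criterion (1) from the text: keep SNPs with MAF > 0.05."""
    return maf > threshold


# rs8087945 (A > G): AA, AG, GG counts taken from Table 3.
maf_controls = minor_allele_frequency(459, 252, 39)
maf_cases = minor_allele_frequency(338, 193, 29)

print(f"controls: {maf_controls:.4f}, cases: {maf_cases:.4f}")
print("passes filter in controls:", passes_maf_filter(maf_controls))
```

Under these assumptions the calculation gives 22.00% in controls and about 22.41% in cases, which agrees with the minor allele frequencies reported for rs8087945 in the Results.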

Statistical analysis 

Data were double entered with EpiData 3.1 (Denmark)
and discrepancies were checked against the raw data.
Continuous variables were described as mean ± SD and
differences between groups were analyzed by using Student's
t-test. Categorical variables were described as percentages
and analyzed by using the chi-square test. An unconditional
logistic regression model was used to calculate odds ratios
(ORs) and 95% confidence intervals (CIs), as well as
corresponding P-values. Hardy-Weinberg equilibrium was
tested among controls using the χ² goodness-of-fit test.
Haplotype blocks were selected with Haploview software
by considering linkage disequilibrium (LD) blocks. The
estimated frequency of polymorphic loci was calculated
using PHASE 2.1 software. All analyses were performed
using SPSS software (SPSS Inc., USA). All P-values
reported were two-sided, and values less than 0.05 were
considered statistically significant.
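
As an illustration of the Hardy-Weinberg goodness-of-fit test described above, the following minimal sketch computes the χ² statistic with one degree of freedom from genotype counts. It is not the authors' SPSS procedure; the function name is hypothetical, and the example input is the control genotype distribution of rs8087945 from Table 3.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) with 1 degree of freedom.
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# rs8087945 in controls: AA = 459, AG = 252, GG = 39 (Table 3).
hwe_chi2, hwe_p = hwe_chi_square(459, 252, 39)
print(f"chi2 = {hwe_chi2:.3f}, P = {hwe_p:.3f}")
```

With these counts the sketch yields values in line with those reported for rs8087945 in the Results (χ² = 0.330, P = 0.566).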

Ethical consideration 

This project was approved by the Institutional
Review Board of Nanjing Medical University. Written
informed consent was obtained from all participants.
Ethical principles were respected throughout the whole study
period.

Table 1 Information of primers and probes

SNP (alleles)         Primer sequence (5'-3')           Probe sequence
rs8087945 (A > G)     F-GGATGACAGAGTGAGTGACACACA        C: FAM-AGAAAGAATCTFGCACTTT-MGB
                      R-AGAAGfAGfAAGAAGTGGGTG           T: HEX-AGAAAGAA 1 1 1 IGGA< 1 1 IA-MGB
rs12456774 (A > G)    F-AG( 1 1 1 GATAGCrGAGfATAGG      C: FAM-GAAGTTAATTfGrGTGTT-MGB
                      R-TTGGGAATGAA 1 1 1 IGGATGAG      T: HEX-GrAAnTAATTGGTGTrTT-MGB
rs12457731 (A > G)    F-ACCATCTTAATGTCTTCGATTGAAGT      G: FAM-CCTTATCCTAGATTCTTAGC-MGB
                      R-TAGATGCTTTCCTCCAGGAGCTA         A: HEX-GCTTATCCTAAATTCTTAGC-MGB
rs12958098 (C > T)    F-TGTGCAAGAGGCTTGAGGCTAT          A: FAM-CAATTCCCAGCAGTCT-MGB
                      R-GTAAACCAAAGCTGAGAGTTFCAATT      G: HEX-GAATFCGCGGGAGTC-MGB
rs4800136 (A > C)     F-CACCTGAGCAACAAAAGGAGTG          G: FAM-CTGTTFGTFGUAATACCT-MGB
                      R-GACTAGAAGCACAAAAGCCATCTG        T: HEX-CTCI 1 IGI 1 1 1 IAATACCT-MGB
rs4800417 (C > T)     F-GTGCCAGCTCTGTGGTFAGCT           G: FAM-TCAACCTTGCCAAGC-MGB
                      R-TCCCCACTCTTACTCATTFGAAAGA       A: HEX-TGCACTCAACCTTACCA-MGB
rs4331426 (A > G)     F-GGTTTGGATTAACAGGCAAGAAA         A: FAM-TACCAATGCAGAGGAT-MGB
                      R-ACCACCTGTTGTAGATGAGAAAATFAAA    G: HEX-TACGAATGCGCAGGAT-MGB

Results

Overall, this study consisted of 578 cases (72.1% males,
27.9% females) and 756 controls (75.3% male, 24.7%
female). The age (mean ± SD) was 52.07 ± 18.01 years 
for cases and 52.85 ± 18.42 years for controls, respec- 
tively. As a result of frequency-matching, there were no 
significant differences in the distribution of gender and 
age between cases and controls. However, education 
level and the history of smoking and drinking were 
found to be different between the two groups. As shown
in Table 2, the proportion of ever smoking was 52.5% in
patients, which was significantly higher than that in
controls (44.4%) (χ² = 8.543, P = 0.003). In contrast, the
proportion of alcohol drinking was 17.9% in TB patients,
which was significantly lower than that in controls
(28.4%) (χ² = 19.675, P < 0.001).
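
The chi-square comparisons above can be reproduced from the counts in Table 2. The sketch below is a hand-rolled Pearson chi-square test for a 2 × 2 table without continuity correction, which is an assumption about how the reported statistics were obtained; the function name is hypothetical.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    a, b: cases exposed / unexposed
    c, d: controls exposed / unexposed
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Ever / never smoking in cases and controls (Table 2).
smoking_chi2, smoking_p = pearson_chi_square_2x2(299, 270, 336, 420)
# Ever / never drinking in cases and controls (Table 2).
drinking_chi2, drinking_p = pearson_chi_square_2x2(101, 463, 215, 541)

print(f"smoking:  chi2 = {smoking_chi2:.3f}, P = {smoking_p:.4f}")
print(f"drinking: chi2 = {drinking_chi2:.3f}, P = {drinking_p:.6f}")
```

Under this assumption the statistics agree with the reported values of 8.543 for smoking and 19.675 for drinking.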

Table 2 Basic characteristics of the cases and controls

                                 Case (N = 578)      Control (N = 756)
Variables                        No.      %          No.      %          P
Age (mean ± SD)                  52.07 ± 18.01       52.85 ± 18.42       0.176
Gender                                                                   0.199
  Male                           417      72.1       569      75.3
  Female                         161      27.9       187      24.7
Education                                                                < 0.001
  Illiterate or semi-illiterate  164      28.4       89       11.8
  Primary school                 133      23.0       215      28.4
  Junior high school             201      34.8       267      35.3
  Senior high school or above    80       13.8       185      24.5
Smoking                                                                  0.003
  Ever                           299      52.5       336      44.4
  Never                          270      47.5       420      55.6
Drinking                                                                 < 0.001
  Ever                           101      17.9       215      28.4
  Never                          463      82.1       541      71.6

As expected, genetic variants of rs4331426 were rare
in the study population. The frequencies of AA, AG,
and GG genotypes were 93.70%, 6.14%, and 0.15%,
respectively. No significant difference was observed in
the distribution of either genotypes or alleles of this
SNP between cases and controls. The six tag SNPs we
genotyped were all in Hardy-Weinberg equilibrium
[rs8087945 (χ² = 0.330, P = 0.566), rs12456774 (χ² =
0.438, P = 0.508), rs12457731 (χ² = 0.236, P = 0.627),
rs12958098 (χ² = 0.757, P = 0.384), rs4800136 (χ² =
0.047, P = 0.829) and rs4800417 (χ² = 1.320, P =
0.251)]. The genotype analysis showed that the minor
allele frequencies of rs8087945, rs12456774, rs12457731,
rs12958098, rs4800136 and rs4800417 were 22.41%,
29.96%, 5.10%, 47.14%, 3.91% and 19.34% in the cases
and 22.00%, 29.93%, 4.37%, 48.99%, 4.03% and 18.66% in
the controls, respectively. No significant difference was
observed in genotypes or allele frequencies of the six tag
SNPs between case and control groups either before or
after adjusting for age, sex, education, smoking, and
drinking history (Table 3). A side-by-side r²/D' plot for the six
tag SNPs is shown in Additional file 1. By considering
both D' and r², we analyzed two haplotypes as presented
in Table 4. For example, strong LD was
observed between rs8087945 and rs12456774 (D' = 1, r²
= 0.12). The common haplotype within this block was
AA (rs8087945-rs12456774). Comparison of haplotype
frequencies between case and control groups demonstrated
that both the haplotype AG (rs8087945-rs12456774) and GA
(rs8087945-rs12456774) were associated with a decreased
risk of TB, with adjusted ORs (95% CI) of 0.34 (0.27-0.42)
and 0.22 (0.16-0.29), respectively (Table 4).
Strong linkage disequilibrium was also observed between
rs4800417 and rs8087945 (D' = 0.99, r² = 0.78). As
compared with the common haplotype CA (rs4800417-rs8087945),
the haplotype TG (rs4800417-rs8087945) was associated with a
decreased risk (aOR = 0.64, 95% CI: 0.50-0.81) whereas
the haplotype CG (rs4800417-rs8087945) was related to an
increased risk of TB (OR = 3.19, 95% CI: 2.26-4.51)
(Table 4).

Table 3 Distribution of genotypes in cases and controls and their risks with pulmonary tuberculosis

Genotype  Case N  %      Control N  %      cOR (95% CI)       P      aOR (95% CI) a     P a    Model b  OR (95% CI)        P
rs8087945 A > G
AA        338     60.36  459        61.20  1                         1                         Dom      1.02 (0.80-1.29)   0.885
AG        193     34.46  252        33.60  1.04 (0.82-1.31)   0.742  1.04 (0.81-1.33)   0.763  Rec      0.88 (0.52-1.48)   0.619
GG        29      5.18   39         5.20   1.01 (0.61-1.67)   0.970  0.89 (0.52-1.51)   0.660  Add      0.99 (0.82-1.21)   0.946
rs12456774 A > G
AA        272     48.66  372        49.60  1                         1                         Dom      1.01 (0.80-1.27)   0.942
AG        239     42.75  307        40.93  1.06 (0.85-1.34)   0.594  1.05 (0.83-1.34)   0.675  Rec      0.81 (0.54-1.21)   0.299
GG        48      8.59   71         9.47   0.92 (0.62-1.38)   0.670  0.83 (0.55-1.26)   0.376  Add      0.96 (0.81-1.15)   0.683
rs12457731 A > G
AA        524     90.66  691        91.52  1                         1                         Dom      1.13 (0.76-1.69)   0.544
AG        49      8.48   62         8.21   1.04 (0.70-1.54)   0.836  1.05 (0.70-1.59)   0.811  Rec      3.42 (0.65-18.02)  0.148
GG        5       0.86   2          0.27   3.30 (0.64-17.05)  0.155  3.43 (0.65-18.09)  0.146  Add      1.19 (0.82-1.71)   0.36
rs12958098 C > T
CC        162     28.98  199        26.82  1                         1                         Dom      0.87 (0.68-1.13)   0.299
CT        267     47.76  359        48.38  0.91 (0.70-1.19)   0.497  0.88 (0.67-1.15)   0.341  Rec      0.94 (0.72-1.24)   0.677
TT        130     23.26  184        24.80  0.87 (0.64-1.18)   0.364  0.87 (0.63-1.20)   0.387  Add      0.93 (0.79-1.09)   0.370
rs4800136 A > C
AA        520     92.36  696        92.06  1                         1                         Dom      0.94 (0.61-1.44)   0.776
AC        42      7.46   59         7.81   0.95 (0.63-1.44)   0.818  0.92 (0.59-1.42)   0.702  Rec      2.39 (0.15-39.23)  0.543
CC        1       0.18   1          0.13   1.34 (0.08-21.45)  0.837  2.37 (0.14-38.99)  0.546  Add      0.96 (0.63-1.45)   0.838
rs4800417 C > T
CC        366     65.24  503        66.80  1                         1                         Dom      1.09 (0.85-1.38)   0.505
CT        173     30.84  219        29.08  1.09 (0.85-1.38)   0.502  1.11 (0.86-1.42)   0.434  Rec      0.91 (0.50-1.66)   0.763
TT        22      3.92   31         4.12   0.98 (0.56-1.71)   0.931  0.94 (0.52-1.72)   0.844  Add      1.05 (0.85-1.29)   0.646

a: Adjusted for age, sex, education, smoking and drinking

b: Dom, dominant model; Rec, recessive model; Add, additive model
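
The crude haplotype odds ratios in Table 4 can be checked from the haplotype counts with a standard Wald confidence interval. The sketch below is illustrative only: the function name is hypothetical, the adjusted ORs in the paper come from logistic regression and are not reproduced here, and AA (rs8087945-rs12456774) is taken as the reference haplotype.

```python
import math


def odds_ratio_wald_ci(case_hap, control_hap, case_ref, control_ref, z=1.959964):
    """Crude odds ratio of a haplotype versus the reference haplotype,
    with a Wald 95% confidence interval computed on the log scale."""
    odds_ratio = (case_hap * control_ref) / (case_ref * control_hap)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_hap + 1 / control_hap + 1 / case_ref + 1 / control_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Reference haplotype AA: 746 case and 725 control chromosomes (Table 4).
or_ag = odds_ratio_wald_ci(159, 454, 746, 725)  # haplotype AG
or_ga = odds_ratio_wald_ci(75, 333, 746, 725)   # haplotype GA

for name, (or_, lo, hi) in (("AG", or_ag), ("GA", or_ga)):
    print(f"{name}: cOR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
```

With the Table 4 counts this reproduces the crude estimates of 0.34 (0.28-0.42) for AG and 0.22 (0.17-0.29) for GA.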

Table 4 Haplotype frequencies and the risks of pulmonary tuberculosis

Haplotype   Case N (%)   Control N (%)   cOR (95% CI)           P          aOR (95% CI) a         P a
rs8087945-rs12456774
AA          746 (64.5)   725 (47.9)      1                                 1
AG          159 (13.8)   454 (30.0)      0.34 (0.28-0.42)       < 0.0001   0.34 (0.27-0.42)       < 0.0001
GA          75 (6.5)     333 (22.0)      0.22 (0.17-0.29)       < 0.0001   0.22 (0.16-0.29)       < 0.0001
GG          176 (15.2)   0 (0)                                  0.994                             0.994
rs4800417-rs8087945
CA          813 (70.3)   1176 (77.8)     1                                 1
TG          126 (10.9)   278 (18.4)      0.66 (0.52-0.82)       0.0003     0.64 (0.50-0.81)       0.0003
CG          126 (10.9)   55 (3.6)        3.31 (2.38-4.61)       < 0.0001   3.19 (2.26-4.51)       < 0.0001
TA          91 (7.9)     2 (0.2)         43.88 (13.84-139.07)   < 0.0001   48.08 (15.02-153.92)   < 0.0001

a: Adjusted for age, sex, education, smoking and drinking.

Dai et al. BMC Infectious Diseases 2011, 11:282
http://www.biomedcentral.com/1471-2334/11/282

Discussion

A puzzling feature of TB is that only a small proportion
of infected persons will develop active disease during
their lifetimes [15], though nearly one third of the global
population has been latently infected with the
pathogen [2]. Host genetic factors can explain, at least
in part, why some people resist infection more successfully
than others [16,17]. Recently, a GWA study from
Ghana and Gambia identified rs4331426 on chromosome
18q11.2 as a susceptibility locus associated with
the risk of TB [11]. To date, this is the only GWA
study of susceptibility to TB, and it suggests that
a new non-MHC locus can be identified in an infectious
disease caused by a highly polymorphic pathogen even
in African populations [11]. The identified variant of
SNP rs4331426 is common in Africans but is much
rarer in other populations. No data on this SNP in
association with TB have yet been published from other
areas of the world. Considering the limitations of extensive
genetic diversity and shorter LD ranges in African
populations, we performed a study to validate this finding
in China by searching for tag SNPs on chromosome
18q11.2 within 100 kb upstream and downstream of
rs4331426. To our knowledge, since the publication
of the GWA study by Thye et al. [11], our work
is the first to explore the role of genetic polymorphisms
in this region in susceptibility to TB. Unfortunately,
we observed no significant association between
TB risk and any of the selected SNPs individually. One possible
explanation might be the heterogeneity of populations,
which is supported by the disparity of genotype
frequencies [18]. Another explanation is that we only
genotyped tag SNPs on chromosome 18q11.2 within
100 kb upstream and downstream of
rs4331426, which represent only a relatively narrow
portion of the genetic loci. Even though none of the
polymorphisms we investigated were associated with TB in
the single-point analysis, we found that haplotypes
within this block might be associated with altered
risks of TB. For example, compared with individuals carrying
the common haplotype AA (rs8087945-rs12456774), those
with AG (rs8087945-rs12456774) or GA (rs8087945-rs12456774) had
a significantly decreased risk. It should be noted that in
this study we only analyzed two haplotypes. Other
haplotypes covering more SNPs might also contribute to
the risk of TB.

Interestingly, chromosome 18q11.2 is a gene-desert
region that is punctuated by evolutionarily conserved
domains with regulatory potential [11]. Neither
rs8087945 nor rs12456774 is located inside any gene or
in a regulatory sequence. The nearest genes to these
SNPs are GATA6, CTAGE1, RBBP8 and CABLES1, as
well as a number of as yet unannotated open reading
frames. Additional studies are required to ascertain their
functional significance and any possible counterbalancing
selective pressures. In addition, it must be noted
that the association found in China could be
population-specific; however, it could also be a false-positive
result. For this reason, it is important that these findings
be replicated in other
areas of China. Future work is needed to explore the
nearest genes as well as the as yet unannotated
open reading frames around this region.

Conclusions 

The susceptibility locus rs4331426 identified in the African
population could not be validated in the Chinese
population. Even though none of the genetic polymorphisms
we investigated was associated with TB in the
single-point analysis, the haplotypes might contribute to
susceptibility to TB in the Chinese Han population.
Additional studies are required to ascertain the causative
variant, its functional significance and any possible
counterbalancing selective pressures.

Additional material 

Additional file 1: Side by side r²/D' plot for selected tag SNPs.
Generated by Haploview software. The 5' and 3' ends of the six SNPs are
indicated. D' values are shown on the squares. The colors of the squares
represent r² values, with dark being r² = 1, and white being r² = 0.
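
The D' and r² statistics shown in such a plot can be illustrated with a short calculation from two-SNP haplotype counts. This is a sketch under stated assumptions, not the Haploview implementation: the function name is hypothetical, and the input is the control haplotype distribution for rs8087945-rs12456774 from Table 4 (AA = 725, AG = 454, GA = 333, GG = 0), which may differ from the data Haploview actually used.

```python
def pairwise_ld(h11, h12, h21, h22):
    """Lewontin's D' and r^2 from counts of the four two-SNP haplotypes.

    h11: allele 1 at both SNPs; h12: allele 1 at SNP 1, allele 2 at SNP 2;
    h21: allele 2 at SNP 1, allele 1 at SNP 2; h22: allele 2 at both SNPs.
    """
    n = h11 + h12 + h21 + h22
    p1 = (h11 + h12) / n  # frequency of allele 1 at the first SNP
    q1 = (h11 + h21) / n  # frequency of allele 1 at the second SNP
    if min(p1, 1 - p1, q1, 1 - q1) == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = h11 / n - p1 * q1  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p1 * (1 - p1) * q1 * (1 - q1))
    return d_prime, r_squared


# Control haplotype counts for rs8087945-rs12456774 (Table 4).
ld_d_prime, ld_r2 = pairwise_ld(725, 454, 333, 0)
print(f"D' = {ld_d_prime:.2f}, r2 = {ld_r2:.2f}")
```

For these counts the sketch gives D' = 1 and r² of about 0.12, in line with the values reported for this SNP pair. A pair can show complete D' with a low r² when one of the four haplotypes is absent but the allele frequencies at the two SNPs differ.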